Abstract

For the class of stationary Gaussian long memory processes, we study some properties of the least-squares
predictor of $X_{n+1}$ based on $(X_n, \ldots, X_1)$. The predictor is obtained by projecting $X_{n+1}$ onto
the finite past and the coefficients of the predictor are estimated on the same realisation. First we prove
moment bounds for the inverse of the empirical covariance matrix. Then we deduce an asymptotic
expression of the mean-squared error. In particular we give a relation between the number of terms
used to estimate the coefficients and the number of past terms used for prediction, which ensures the
$L^2$-sense convergence of the predictor. Finally we prove a central limit theorem when our predictor
converges to the best linear predictor based on all the past.

Keywords : linear prediction, long memory, least-squares predictor based on finite past, Toeplitz 
matrix 

1 Introduction 

Consider $(X_t)_{t\in\mathbb{Z}}$ a stationary process with zero mean and finite variance. We wish to predict $X_{n+1}$
from the observed past $(X_1, \ldots, X_n)$ using a linear predictor i.e. a linear combination of the observed
data. First we define the coefficients of the optimal predictor in the least squares sense assuming that
the covariance function is known. Then we need to estimate these coefficients. This second step
is often realised under the following restrictive hypothesis: we predict another future independent series
with exactly the same probabilistic structure; the observed series is only used to compute the forecast
coefficients (see for example Bhansali (1978), Lewis and Reinsel (1985) or Godet (2007b)). This
assumption makes the mathematical analysis easier since the prediction problem can be reduced to an
estimation problem of the forecast coefficients by conditioning on the process which we forecast. But
the practitioner rarely has two independent series: one to estimate the model, one to predict. The
forecast coefficients have to be estimated on the same realisation as the one being forecast. In the following we
concentrate on this case, called same-realisation prediction.

The performance of the predictor depends on two parameters: the dimension of the subspace on which we 
project and the number of available data to estimate the forecast coefficients. To reduce the prediction
error, it is reasonable to increase the dimension of the space onto which we project, as more and more 
observations become available. But when the dimension and then the number of forecast coefficients 
increase, the estimation of these coefficients becomes more difficult and can affect the mean-squared 
error. 

When the spectral density of the process $(X_t)_{t\in\mathbb{Z}}$ exists, is bounded and bounded away from 0 (this
is typical of the short memory case), Ing and Wei (2003) and Kunitomo and Yamamoto (1985) have
studied the mean-squared prediction error for same-realisation prediction. The mean-squared error for
same-realisation prediction can be approximated by the sum of two terms: one due to the goodness of
fit and one due to the model complexity. In the short memory case, it is interesting to remark that the
approximations of the mean-squared error for same and independent realisation prediction are the same.
The question of the performance of the least-squares predictor for long memory time series is still open. And
in this case, the asymptotic equivalence between the mean-squared error in same and independent
realisation should not be taken for granted since the autocovariance function decays more slowly than in
the short memory case.

The paper is organised as follows. In Sections 2 and 3 we generalise the results of Ing and Wei (2003) to
find an asymptotic expression of the mean-squared error for long memory time series. The mean-squared
error is approximated by the same function as in the short memory case but under more restrictive
conditions on the number of available observations and on the model complexity. In the last section, we
prove a central limit theorem. More precisely, we prove the convergence in distribution of the normalised
difference between our predictor and the Wiener-Kolmogorov predictor, which is the least-squares
predictor knowing all the past. The normalisation is different from the short memory case since it is given
by the goodness of fit of the projection.



Definition of the Predictor Let $(X_t)_{t\in\mathbb{Z}}$ be a stationary process with zero mean and finite variance.
We assume that the autocovariance function $\sigma$ of the process is known. Our goal is to predict $X_{n+1}$
using the $k$ previous observed data. The optimal linear predictor is defined as the projection mapping
onto the closed span of the subset $\{X_n, \ldots, X_{n-k+1}\}$ in the Hilbert space $L^2(\Omega, \mathcal{F}, \mathbb{P})$ with inner product
$\langle X, Y \rangle = \mathbb{E}(X'Y)$ where $X'$ denotes the transpose of the vector $X$. It is the least-squares predictor
knowing $(X_{n-k+1}, \ldots, X_n)$. We denote by $\widetilde{X}_{n+1}(k)$ this predictor and by $-a_{j,k}$ the theoretical prediction
coefficients i.e.:

$$\widetilde{X}_{n+1}(k) = \sum_{j=1}^{k} (-a_{j,k})\, X_{n+1-j}. \qquad (1)$$

They are given by (see Brockwell and Davis (1988) Section 5.1):

$$(a_{1,k}, \ldots, a_{k,k})' = -\Sigma(k)^{-1} (\sigma(1), \ldots, \sigma(k))' \qquad (2)$$

where $\Sigma(k)$ is the covariance matrix of the vector $(X_1, \ldots, X_k)$.
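As a numerical illustration of (1) and (2), the following sketch solves the Toeplitz system for a fractionally integrated noise $(1-B)^d X_t = \varepsilon_t$ with unit innovation variance. It is only a sketch under stated assumptions: the closed-form autocovariance of this process is the standard one and is brought in for the example, and the function names `fi_autocovariance` and `theoretical_coefficients` are hypothetical.

```python
import numpy as np
from math import gamma


def fi_autocovariance(d, k, sigma2_eps=1.0):
    """Autocovariances sigma(0..k) of fractionally integrated noise (assumed model)."""
    s = np.empty(k + 1)
    s[0] = sigma2_eps * gamma(1 - 2 * d) / gamma(1 - d) ** 2
    for h in range(1, k + 1):
        # standard recursion sigma(h) = sigma(h-1) * (h-1+d) / (h-d)
        s[h] = s[h - 1] * (h - 1 + d) / (h - d)
    return s


def theoretical_coefficients(sigma, k):
    """Return (a_{1,k}, ..., a_{k,k}) from equation (2)."""
    lags = np.abs(np.subtract.outer(np.arange(k), np.arange(k)))
    Sigma_k = sigma[lags]                      # Toeplitz covariance matrix Sigma(k)
    return -np.linalg.solve(Sigma_k, sigma[1:k + 1])


if __name__ == "__main__":
    sigma = fi_autocovariance(0.3, 5)
    print(theoretical_coefficients(sigma, 5))  # predictor weights are minus these values
```

The predictor (1) then uses the weights $-a_{j,k}$ on $X_n, \ldots, X_{n-k+1}$.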



Estimation of the Forecast Coefficients When the autocovariance function $\sigma$ of the process $(X_t)_{t\in\mathbb{Z}}$
is unknown, we can plug in an estimate of the prediction coefficients $(-a_{j,k})_{1\le j\le k}$. The estimate is
constructed from the last $n$ observations $(X_n, \ldots, X_1)$ and our predictor is the projection onto the last $k$
observations ($k < n$). The covariance matrix is estimated by:

$$\hat{\Sigma}_n(k) := \frac{1}{n-K_n+1} \sum_{j=K_n}^{n} \mathbf{X}_j(k)\mathbf{X}_j'(k) \qquad (3)$$

where

$$\mathbf{X}_j(k) := (X_j, \ldots, X_{j-k+1})' \qquad (4)$$

and where $K_n$ is the maximum dimension of the subspace onto which we project i.e. we will study
the family of predictors $(\hat{X}_{n+1}(k))_{1\le k\le K_n}$. $K_n$ will be an increasing sequence of integers which can be
bounded or can go to infinity.

The prediction coefficients $a_{j,k}$ are estimated from $(X_1, \ldots, X_n)$ by:

$$\hat{\mathbf{a}}(k) = (-\hat{a}_{1,k}, \ldots, -\hat{a}_{k,k})' = \hat{\Sigma}_n^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} \mathbf{X}_j(k) X_{j+1}.$$

The resulting one-step predictor is:

$$\hat{X}_{n+1}(k) = \mathbf{X}_n'(k)\,\hat{\mathbf{a}}(k) \qquad (5)$$
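The estimator (3)-(5) can be sketched numerically as follows. This is a minimal illustration and not the implementation used in the paper: the function name `same_realisation_predictor` and the array layout (`x[0]` holds $X_1$) are assumptions made for the example.

```python
import numpy as np


def same_realisation_predictor(x, k, K_n):
    """One-step predictor (5) with coefficients estimated on the same series.

    x   : observations X_1, ..., X_n (x[0] is X_1)
    k   : number of past values used for prediction, 1 <= k <= K_n
    K_n : maximal order; the sums in (3) start at j = K_n
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    N = n - K_n + 1

    def lag_vector(j):
        # X_j(k) = (X_j, ..., X_{j-k+1})' with 1-based time index j
        return x[j - k:j][::-1]

    # empirical covariance matrix, equation (3): sum over j = K_n, ..., n
    Sigma_hat = sum(np.outer(lag_vector(j), lag_vector(j))
                    for j in range(K_n, n + 1)) / N
    # cross moments with X_{j+1} (stored in x[j]): sum over j = K_n, ..., n-1
    cross = sum(lag_vector(j) * x[j] for j in range(K_n, n)) / N

    a_hat = np.linalg.solve(Sigma_hat, cross)   # estimated coefficient vector
    return float(lag_vector(n) @ a_hat), a_hat
```

Solving the linear system rather than forming the inverse of $\hat{\Sigma}_n(k)$ explicitly is a numerical choice of this sketch; both give the same predictor when the matrix is invertible.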

In this paper we use $C$ to denote generic positive constants that are independent of the sample size $n$
but may depend on the distributional properties of the process $(X_t)_{t\in\mathbb{Z}}$. Moreover $C$ may also stand for
different values in different equations.

The following assumptions on the process $(X_t)_{t\in\mathbb{Z}}$ are essential to the results presented in the paper.
There exists $d \in ]0, 1/2[$ such that:

H.1 The stationary process $(X_t)_{t\in\mathbb{Z}}$ is Gaussian and admits an infinite moving average representation
and an infinite autoregressive representation as follows:

$$\varepsilon_t = \sum_{j=0}^{+\infty} a_j X_{t-j} \quad \text{and} \quad X_t = \sum_{j=0}^{+\infty} b_j \varepsilon_{t-j} \qquad (6)$$

with $a_0 = b_0 = 1$, for any $j \ge 1$ and for any $\delta > 0$, $|a_j| \le C j^{-d-1+\delta}$ and $|b_j| \le C j^{d-1+\delta}$, and $(\varepsilon_t)_{t\in\mathbb{Z}}$
is a white noise process. These assumptions on the coefficients are verified by both long memory
and short memory processes;

H.2 The covariance $\sigma(k)$ is equivalent to $L(k)k^{2d-1}$ as $k$ goes to infinity, where $L$ is a slowly varying
function (i.e. for every $\alpha > 0$, $x^{\alpha} L(x)$ is ultimately increasing and $x^{-\alpha} L(x)$ is ultimately
decreasing). Under this assumption the autocovariances are not absolutely summable and thus the process
is a long memory process;

H.3 The spectral density of the process $(X_t)_{t\in\mathbb{Z}}$ exists and has a strictly positive lower bound;

H.4 The coefficients $(a_j)_{j\in\mathbb{N}}$ verify:

$$a_j \underset{j\to+\infty}{\sim} L(j)\, j^{-d-1} \qquad (7)$$

with $L$ a slowly varying function.
For example, the assumptions H.1-H.4 hold for the most studied long memory process, the Gaussian
FARIMA process, which is the stationary solution to the difference equations:

$$\phi(B)(1-B)^d X_n = \theta(B)\varepsilon_n \qquad (8)$$

where $(\varepsilon_n)_{n\in\mathbb{Z}}$ is a white noise series with mean zero, $B$ is the backward shift operator and $\phi$ and $\theta$ are
polynomials with no zeroes in the unit disk.
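For the simplest case of (8), $\phi = \theta = 1$, the coefficients of the two representations in (6) follow from the power series of $(1-B)^{d}$ and $(1-B)^{-d}$. The sketch below computes them by recursion so that the hyperbolic decay rates of H.1 and H.4 can be inspected numerically. The function name `fractional_noise_coefficients` is hypothetical and the restriction to $\phi = \theta = 1$ is an assumption of the example.

```python
import numpy as np


def fractional_noise_coefficients(d, m):
    """Coefficients a_0..a_m of (1-B)^d and b_0..b_m of (1-B)^(-d)."""
    a = np.empty(m + 1)
    b = np.empty(m + 1)
    a[0] = b[0] = 1.0
    for j in range(1, m + 1):
        a[j] = a[j - 1] * (j - 1 - d) / j   # autoregressive side, a_j ~ C j^(-d-1)
        b[j] = b[j - 1] * (j - 1 + d) / j   # moving average side, b_j ~ C j^(d-1)
    return a, b


if __name__ == "__main__":
    a, b = fractional_noise_coefficients(0.3, 2000)
    # empirical decay exponents from the ratio between lags 1000 and 2000
    print(np.log(a[2000] / a[1000]) / np.log(2.0))
    print(np.log(b[2000] / b[1000]) / np.log(2.0))
```

The two printed exponents should be close to $-d-1$ and $d-1$ respectively.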

We only use assumptions H.1-H.3 to give an asymptotic expression of the mean-squared error of the 
predictor. Assumption H.4 is a more restrictive assumption used to prove a central limit theorem for 
our predictor. 

Assumption H.2 does not imply the bound on the coefficients $(a_j)_{j\in\mathbb{N}}$ and $(b_j)_{j\in\mathbb{N}}$ given in assumption
H.1. Inoue (2000) has proved that the asymptotic expression of the autocovariance $\sigma(k) \sim L(k)k^{2d-1}$



implies: 



and 




as $j \to +\infty$

if we assume that the sequences $(b_j)_{j\in\mathbb{N}}$ and $(a_j)_{j\in\mathbb{N}}$ are eventually decreasing to zero and $b_j > 0$ for all
$j \in \mathbb{N}$. Such assumptions on the sign of the sequence $(b_j)_{j\in\mathbb{N}}$ or its monotonicity are not necessary, for
example, to prove Lemma 2.1 and to find moment bounds for the inverse sample covariance matrix.



2 Moment bounds 



In this section, we establish moment bounds for the inverse sample covariance matrix and apply these
results to obtain the rate of convergence of $\hat{\Sigma}_n^{-1}(k)$ to $\Sigma^{-1}(k)$.

Throughout the paper, $\lambda_{\min}(Y)$ and $\lambda_{\max}(Y)$ are respectively the smallest and the largest eigenvalues
of the matrix $Y$. We equip the set of matrices with the norm

$$\|Y\|^2 = \lambda_{\max}(Y'Y) \qquad (9)$$

(see for example Dahlhaus (1989)). For a symmetric matrix, this norm is equal to the spectral radius
and for a vector $(X_1, \ldots, X_n)$, it is equal to $\sqrt{\sum_{i=1}^{n} X_i^2}$. This norm is a matrix norm that verifies: for
any matrices $A$ and $B$:

$$\|AB\| \le \|A\|\,\|B\|. \qquad (10)$$
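The properties of the norm (9) stated above can be checked numerically. The following sketch is only an illustration; the helper name `spectral_norm` is hypothetical.

```python
import numpy as np


def spectral_norm(Y):
    """Norm (9): square root of the largest eigenvalue of Y'Y."""
    Y = np.atleast_2d(np.asarray(Y, dtype=float))
    return float(np.sqrt(np.linalg.eigvalsh(Y.T @ Y).max()))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    S = A + A.T
    # submultiplicativity (10)
    print(spectral_norm(A @ B) <= spectral_norm(A) * spectral_norm(B))
    # symmetric matrix: the norm equals the spectral radius
    print(spectral_norm(S), np.abs(np.linalg.eigvalsh(S)).max())
    # vector: the norm equals the Euclidean norm
    print(spectral_norm([3.0, 4.0]))
```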

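Lemma 2.1. Let $(K_n)_{n\in\mathbb{N}}$ be an increasing sequence of positive integers satisfying $K_n = o(\sqrt{n})$. Assume
(H.1). Then, for any $q > 0$, for any $\theta > 0$ and for any $1 \le k \le K_n$,

$$\mathbb{E}\left[\lambda_{\min}^{-q}\left(\hat{\Sigma}_n(k)\right)\right] = O\left(k^{(2+\theta)q}\right)$$

where $\hat{\Sigma}_n(k)$ is defined in (3).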



Proof. The sketch of the proof is the same as that of Lemma 1 of Ing and Wei (2003). The arguments
are the following:

1. the series $\sum_{j=1}^{+\infty} |a_j|$ converges;

2. the cumulative distribution function of the random variable $\varepsilon_t$ is a Lipschitz function and we may
choose a Lipschitz constant independent of $t$. For any integer $t$ and for any reals $x$ and $y$, there
exists $C$ independent of $t$ such that:

$$|\mathbb{P}(\varepsilon_t \le x) - \mathbb{P}(\varepsilon_t \le y)| \le C|x - y|.$$

In our context these two conditions are satisfied. The sequence $(a_j)_{j\in\mathbb{N}}$ is summable under assumption
H.1. Since we have assumed that the process $(X_t)_{t\in\mathbb{Z}}$ is Gaussian, $(\varepsilon_t)_{t\in\mathbb{Z}}$ is a sequence of independent
and identically distributed Gaussian random variables. The distribution function of $\varepsilon_t$ is
independent of $t$ and is a Lipschitz function. □

For $n$ sufficiently large, Lemma 2.1 guarantees that $\hat{\Sigma}_n^{-1}(k)$ almost surely exists as the minimum
eigenvalue of $\hat{\Sigma}_n(k)$ is almost surely positive. We also obtain an upper bound for the mean of the
maximum eigenvalue of $\hat{\Sigma}_n^{-1}(k)$. But this upper bound is not uniform as $k \to +\infty$ and therefore does
not provide an asymptotic equivalent of the prediction error. Nevertheless the bound given in Lemma
2.1 is the basis of the following theorem.

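Theorem 1. Assume that the process $(X_t)_{t\in\mathbb{Z}}$ verifies the hypotheses H.1-H.3.

• if $d \in ]0, 1/4[$ and if there exists $\delta > 0$ such that $K_n^{2+\delta} = O(n)$ then for all $q > 0$ and for all
$1 \le k \le K_n$:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q = O(1) \qquad (11)$$

and

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^{q/2} \le C \left(\frac{K_n^2}{n-K_n+1}\right)^{q/4} \qquad (12)$$

for sufficiently large $n$;

• if $d \in ]1/4, 1/2[$ and if there exist $\delta > 0$ and $\delta' > 0$ such that $K_n^{2+\delta} = O(n^{2-4d-\delta'})$ then for all
$q > 0$ and for all $1 \le k \le K_n$:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q = O(1) \qquad (13)$$

and

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^{q/2} \le C \left(\frac{K_n^2 L^2(n-K_n+1)}{(n-K_n+1)^{2-4d}}\right)^{q/4} \qquad (14)$$

for sufficiently large $n$;

• if $d = 1/4$ and if there exist $\delta > 0$ and $\delta' > 0$ such that $K_n^{2+\delta} = O(n^{1-\delta'})$ then for all $q > 0$ and
for all $1 \le k \le K_n$:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q = O(1) \qquad (15)$$

and

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^{q/2} \le C \left(\frac{K_n^2 L^2(n-K_n+1)\log(n-K_n+1)}{n-K_n+1}\right)^{q/4} \qquad (16)$$

for sufficiently large $n$.

In the proof of Theorem 1 we need the following lemma.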
Lemma 2.2. If the process $(X_t)_{t\in\mathbb{Z}}$ verifies (H.2), if $1 \le k \le K_n$ and

• if $d \in ]0, 1/4[$, then for all $q > 0$,

$$\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le C \left(\frac{K_n^2}{n-K_n+1}\right)^{q/2} \qquad (17)$$

• if $d \in ]1/4, 1/2[$, then for all $q > 0$,

$$\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le C \left(\frac{K_n^2 L^2(n-K_n+1)}{(n-K_n+1)^{2-4d}}\right)^{q/2} \qquad (18)$$

• if $d = 1/4$, then for all $q > 0$,

$$\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le C \left(\frac{K_n^2 L^2(n-K_n+1)\log(n-K_n+1)}{n-K_n+1}\right)^{q/2} \qquad (19)$$


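Proof. We only prove the inequalities (17), (18) and (19) for $q \ge 2$. The general case ($q > 0$) easily
follows from Jensen's inequality. We consider the matrix norm $\|\cdot\|_E$ (see Ciarlet (1982)) defined for all
matrix $Y = (y_{i,j})_{1\le i,j\le k}$ by

$$\|Y\|_E = \sqrt{\sum_{i=1}^{k}\sum_{j=1}^{k} y_{i,j}^2}.$$

Since the matrix $\hat{\Sigma}_n(k) - \Sigma(k)$ is symmetric, we have

$$\|\hat{\Sigma}_n(k) - \Sigma(k)\| \le \|\hat{\Sigma}_n(k) - \Sigma(k)\|_E.$$

We obtain:

$$\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le \|\hat{\Sigma}_n(k) - \Sigma(k)\|_E^q = \left(\sum_{i=1}^{k}\sum_{j=1}^{k} (\hat{\sigma}_{i,j} - \sigma(i-j))^2\right)^{q/2} \qquad (20)$$

where $\hat{\sigma}_{i,j}$ and $\sigma(i-j)$ denote respectively the $(i,j)$ entries of the matrices $\hat{\Sigma}_n(k)$ and $\Sigma(k)$.
Applying Jensen's inequality to (20) because $q/2 \ge 1$, we have:

$$\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le k^{q-2} \sum_{i=1}^{k}\sum_{j=1}^{k} |\hat{\sigma}_{i,j} - \sigma(i-j)|^q.$$

It follows that:

$$\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q \le k^{q-2} \sum_{i=1}^{k}\sum_{j=1}^{k} \mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q. \qquad (21)$$

Now we derive the limiting distribution of $\hat{\sigma}_{i,j} - \sigma(i-j)$ to find an asymptotic expression of
$\mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q$. We shall work with the definition of the empirical covariances. By (3), we have:

$$\hat{\sigma}_{i,j} = \frac{1}{n-K_n+1} \sum_{l=K_n}^{n} X_{l-i+1}X_{l-j+1} \overset{\mathcal{L}}{=} \frac{1}{n-K_n+1} \sum_{l=1}^{n-K_n+1} X_l X_{l+j-i} \qquad (22)$$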
where the second equality, which holds in distribution, is ensured by the strict stationarity of the process. Without loss of generality,
we assume $j > i$. The right term of (22) can be written:

$$\frac{1}{n-K_n+1}\sum_{l=1}^{n-K_n+1} X_l X_{l+j-i} = \frac{1}{n-K_n+1}\, \mathbf{X}_m'(m)\, T_{i,j}\, \mathbf{X}_m(m), \qquad m = n-K_n+1+j-i, \qquad (23)$$

where $\mathbf{X}_m(m)$ is defined in (4) and the entries of the matrix $T_{i,j}$ verify

$$t_{i,j}(s,t) = \begin{cases} 1/2 & \text{if } |s-t| = j-i \\ 0 & \text{otherwise.} \end{cases}$$

$T_{i,j}$ is a Toeplitz matrix because it has symbol $g_{i,j}(x) = \cos((j-i)x)$ i.e.
$t_{i,j}(s,t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g_{i,j}(x)\cos((t-s)x)\,dx$.

Under Assumption H.2 with $d \in ]0, 1/4[$, the spectral density verifies in a neighbourhood of 0:

$$f(x) = O\left(x^{-1/2}\right)$$

(see Zygmund (1968) Chap. 5 Theorem 2.6). By applying Theorem 2 of Fox and Taqqu (1987) to (23),
we obtain the following convergence:

$$\frac{n-K_n+1}{\sqrt{n-K_n+1+j-i}}\left(\hat{\sigma}_{i,j} - \sigma(i-j)\right) \underset{n\to+\infty}{\Longrightarrow} \mathcal{N}\left(0,\; 4\pi \int_{-\pi}^{\pi} f^2(x)\cos^2((j-i)x)\,dx\right) \qquad (24)$$

where $\Longrightarrow$ denotes the convergence in distribution. This convergence in distribution follows from the
convergence of all the cross-cumulants and hence the convergence of all the moments of the left term of
(24). From this convergence in distribution we can deduce an asymptotic expression of the moments of
$\hat{\sigma}_{i,j} - \sigma(i-j)$. If $q$ is even, we have an asymptotic equivalent as $n \to +\infty$:

$$\mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q \underset{n\to+\infty}{\sim} \left(\frac{\sqrt{n-K_n+1+j-i}}{n-K_n+1}\right)^q \mathbb{E}\left[|Y|^q\right], \qquad (25)$$

where $Y$ is a Gaussian random variable which has for probability distribution the right term of (24).
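The $q$th-order absolute moment has the form:

$$\mathbb{E}|Y|^q = \sigma_Y^q\, \frac{2^{q/2}}{\sqrt{\pi}}\, \Gamma\left(\frac{q+1}{2}\right) \qquad (26)$$

where $\sigma_Y^2$ denotes the variance of $Y$. Moreover notice that for all $(i,j)$:

$$\sigma_Y \le \sqrt{4\pi \int_{-\pi}^{\pi} f^2(\lambda)\,d\lambda} := M. \qquad (27)$$

Thus (25), (26) and (27) imply for sufficiently large $n$:

$$\mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q \le C \left(\frac{\sqrt{n-K_n+1+j-i}}{n-K_n+1}\right)^q \le C \left(\frac{1}{\sqrt{n-K_n+1}}\right)^q$$

with $C$ independent of $(i,j)$. The result (17) follows from the previous inequality and inequality (21)
for $q \ge 2$.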

For any $d \in ]1/4, 1/2[$, we apply the proposition of Rosenblatt (1979) which gives the following
convergence in distribution:

$$\frac{(n-K_n+1)\left(\hat{\sigma}_{i,j} - \sigma(i-j)\right)}{L(n-K_n+1)\,(n-K_n+1+j-i)^{2d}} \underset{n\to+\infty}{\Longrightarrow} \mathcal{R}$$

where $\mathcal{R}$ is a Rosenblatt process. This convergence in distribution is obtained by proving the convergence
of the cumulants of any order and hence the convergence of the moments of any order. This limit does
not depend on the difference $(j-i)$. Similarly to the proof of (25) we show for any even integer $q$:

$$\mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q \le C \left(\frac{L(n-K_n+1)}{(n-K_n+1)^{1-2d}}\right)^q$$

where $C$ does not depend on $(j-i)$. This inequality and (21) yield the desired result.

Finally for $d = 1/4$, we apply Theorem 4 of Hosking (1996), which gives the following convergence
in distribution:

$$\sqrt{\frac{n-K_n+1}{\log(n-K_n+1)\,L^2(n-K_n+1)}}\;\left(\hat{\sigma}_{i,j} - \sigma(i-j)\right) \underset{n\to+\infty}{\Longrightarrow} \mathcal{N}(0, \sigma^2)$$

where $\sigma$ depends on the process $(X_t)_{t\in\mathbb{Z}}$ but not on $(i,j)$. This convergence in distribution is obtained
by proving the convergence of the cumulants of any order and hence the convergence of the moments.
We then obtain that for any even integer $q$:

$$\mathbb{E}|\hat{\sigma}_{i,j} - \sigma(i-j)|^q \le C \left(\frac{L^2(n-K_n+1)\log(n-K_n+1)}{n-K_n+1}\right)^{q/2}$$

with $C$ independent of $(i,j)$. The result (19) follows from this inequality and (21).

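□

We now prove Theorem 1.

Proof of Theorem 1. Since $\|\cdot\|$ is a matrix norm (see (10)),

$$\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^q \le \|\hat{\Sigma}_n^{-1}(k)\|^q\, \|\hat{\Sigma}_n(k) - \Sigma(k)\|^q\, \|\Sigma^{-1}(k)\|^q.$$

Furthermore, by assumption H.3, the spectral density of the process $(X_t)_{t\in\mathbb{Z}}$ has a strictly positive lower
bound. Thus from Grenander and Szegő (1958), there exists a constant $C$ such that for all $n > 0$:

$$\|\Sigma^{-1}(k)\|^q \le C. \qquad (28)$$

Using Hölder's inequality with $1/p' + 1/q' = 1$, we obtain:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^q \le C \left(\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{qp'}\right)^{1/p'} \left(\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^{qq'}\right)^{1/q'}.$$

By Lemma 2.1, we have for all $\theta > 0$ and for large $n$:

$$\left(\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{qp'}\right)^{1/p'} \le C\left(k^{2+\theta}\right)^q$$

then

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^q \le C\left(k^{2+\theta}\right)^q \left(\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^{qq'}\right)^{1/q'}. \qquad (29)$$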



We now apply Lemma 2.2. In order to treat together the three situations $d \in ]0, 1/4[$, $d \in ]1/4, 1/2[$ and
$d = 1/4$, we define $h(n)$ by:

$$h(n) = \begin{cases} \dfrac{K_n^2}{n-K_n+1} & \text{if } d \in ]0, 1/4[ \\[2ex] \dfrac{K_n^2 L^2(n-K_n+1)}{(n-K_n+1)^{2-4d}} & \text{if } d \in ]1/4, 1/2[ \\[2ex] \dfrac{K_n^2 L^2(n-K_n+1)\log(n-K_n+1)}{n-K_n+1} & \text{if } d = 1/4. \end{cases}$$

For large $n$, we then obtain:

$$\left(\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^{qq'}\right)^{1/q'} \le C\,(h(n))^{q/2}. \qquad (30)$$

From inequality (29) and the bound (30), we obtain that there exists $\theta > 0$ such that for sufficiently
large $n$:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^q \le C\left(k^{4+\theta} h(n)\right)^{q/2}. \qquad (31)$$
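By inequalities (31) and (28), we have:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q \le C\left(1 + \left(k^{4+\theta} h(n)\right)^{q/2}\right). \qquad (32)$$

This inequality is not sufficient to obtain (11) and (13) since under the assumptions of Theorem 1
$\left(k^{4+\theta} h(n)\right)^{q/2}$ is not necessarily bounded. We have to improve the intermediate inequality (32).
The Cauchy-Schwarz inequality and (28) give:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^{q/2} \le C \left(\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q\right)^{1/2} \left(\mathbb{E}\|\hat{\Sigma}_n(k) - \Sigma(k)\|^q\right)^{1/2}. \qquad (33)$$

And there exists $C > 0$ independent of $q$ such that:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{q/2} \le C \left(\mathbb{E}\|\Sigma^{-1}(k)\|^{q/2} + \mathbb{E}\|\hat{\Sigma}_n^{-1}(k) - \Sigma^{-1}(k)\|^{q/2}\right). \qquad (34)$$

Inequalities (34), (33), (32) and Lemma 2.2 imply that:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{q/2} \le C\left(1 + \left(k^{4+\theta} h^2(n)\right)^{q/4}\right). \qquad (35)$$

Repeating $s-1$ times this argument (i.e. using inequalities (33) and Lemma 2.2), one
has for large $n$:

$$\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{q/2^{s}} \le C\left(1 + \left(k^{4+\theta} h^{s+1}(n)\right)^{q/2^{s+1}}\right). \qquad (36)$$

By assumption there exists $\delta > 0$ such that $h(n)k^{\delta}$ converges to 0 as $n$ tends to infinity, therefore there
exists $s$ such that $\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^{q 2^{-s}}$ is bounded. Since $q$ in (36) is arbitrary, (11) and (13) are proved.
Inequalities (12) and (14) follow from (33) and from Lemma 2.2. □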



In the following section, we establish an asymptotic expression for the mean-squared prediction error
of the least-squares predictor using the sharp upper bound for $\mathbb{E}\|\hat{\Sigma}_n^{-1}(k)\|^q$ given in Theorem 1.
3 The mean-squared prediction error of the least-squares predictor

In this section, our goal is to give an asymptotic expression of the mean-squared prediction error of the
predictor defined in (5). First we decompose the forecast error in:

$$X_{n+1} - \hat{X}_{n+1}(k) = \varepsilon_{n+1} + f(k) + S_n(k) \qquad (37)$$

where $\varepsilon_{n+1}$ is the innovation white noise at time $n+1$ and cannot be forecast, $S_n(k)$ is the error due
to the projection onto the closed span of the subset $\{X_n, \ldots, X_{n-k+1}\}$ and $f(k)$ is the error due to the
estimation of the prediction coefficients. More precisely if we set $a_{i,k} = 0$ for $i > k$ (for $i \le k$, the
coefficients are defined in (2)), we have

$$S_j(k) = -\sum_{i=1}^{+\infty} (a_i - a_{i,k})\, X_{j+1-i} \qquad (38)$$

and with $\varepsilon_{j+1,k}$ equal to the forecast error of $X_{j+1}$ due to the projection onto $(X_j, \ldots, X_{j-k+1})$ i.e.

$$\varepsilon_{j+1,k} = X_{j+1} - P_{[X_{j-k+1}, \ldots, X_j]}(X_{j+1}) = X_{j+1} + \sum_{l=1}^{k} a_{l,k} X_{j+1-l},$$

we have

$$f(k) = -\mathbf{X}_n'(k)\, \hat{\Sigma}_n^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} \mathbf{X}_j(k)\, \varepsilon_{j+1,k}$$

where $\mathbf{X}_j(k)$ is defined in (4).

In view of (37), we obtain the decomposition of the mean-squared prediction error as the sum of the
variance $\sigma_\varepsilon^2$ of the white noise and the error due to the prediction method $\mathbb{E}(f(k) + S_n(k))^2$:

$$\mathbb{E}\left(X_{n+1} - \hat{X}_{n+1}(k)\right)^2 = \sigma_\varepsilon^2 + \mathbb{E}\left(f(k) + S_n(k)\right)^2.$$

Theorem 2. Under assumptions H.1-H.3, if we choose the sequence $(K_n)_{n\in\mathbb{N}}$ such that for some $\delta > 0$:

K = o(n 1 " 2d - 5 ), (39) 

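then

$$\lim_{n\to+\infty} \max_{1\le k\le K_n} \left| \frac{\mathbb{E}\left(X_{n+1} - \hat{X}_{n+1}(k)\right)^2 - \sigma_\varepsilon^2}{L_n(k)} - 1 \right| = 0$$

where

$$L_n(k) = \mathbb{E}(S_n(k))^2 + \frac{k}{n-K_n+1}\,\sigma_\varepsilon^2, \qquad (40)$$

$S_n(k)$ being defined in (38).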
Remark If we fit a misspecified AR($k$) model to the long-memory time series $(X_n)_{n\in\mathbb{Z}}$ to forecast it,
we find the same predictor as (5). Consequently $L_n(k)$ can be viewed as the quality of prediction by an
AR model. From (40), this quality is the sum of the model complexity $\frac{k}{n-K_n+1}\sigma_\varepsilon^2$ and the goodness of
fit $\mathbb{E}(S_n(k))^2$.

Proof. By (37), we have:

$$\frac{\mathbb{E}\left(X_{n+1} - \hat{X}_{n+1}(k)\right)^2 - \sigma_\varepsilon^2}{L_n(k)} = \frac{\mathbb{E}\left(f(k) + S_n(k)\right)^2}{L_n(k)}.$$

Our proof is divided into three steps:

1. we provide an approximation of $\mathbb{E}(f(k))^2$ which is easier to estimate. This approximation, denoted
by $\mathbb{E}(f_1(k))^2$, will be defined in (41);

2. we show that the asymptotic equivalent of $\mathbb{E}(f_1(k))^2$ is $\frac{k}{n-K_n+1}\sigma_\varepsilon^2$;

3. we prove that the cross-product term $\mathbb{E}(f(k)S_n(k))$ is negligible with respect to $L_n(k)$.

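First step We introduce

$$f_1(k) := -\mathbf{X}_n^{*\prime}(k)\, \Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-\sqrt{n}-1} \mathbf{X}_j(k)\, \varepsilon_{j+1} \qquad (41)$$

with

$$\mathbf{X}_n^{*}(k) = \left( \sum_{j=0}^{\sqrt{n}/2-K_n} b_j \varepsilon_{n-j},\; \ldots,\; \sum_{j=0}^{\sqrt{n}/2-K_n} b_j \varepsilon_{n-k+1-j} \right)'.$$

Lemma 3.1. If the assumptions of Theorem 2 hold, then

$$\lim_{n\to+\infty} \max_{1\le k\le K_n} \mathbb{E}\left( \frac{f(k) - f_1(k)}{\sqrt{L_n(k)}} \right)^2 = 0. \qquad (42)$$

Proof. See the appendix. □

Second step We prove that

$$\lim_{n\to+\infty} \max_{1\le k\le K_n} \left| \mathbb{E}\left[ \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, f_1^2(k) \right] - 1 \right| = 0. \qquad (43)$$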
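First observe that

$$\begin{aligned}
\mathbb{E}\left[ \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, f_1^2(k) \right]
&= \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, \mathbb{E}\left[ \left( \mathbf{X}_n^{*\prime}(k)\Sigma^{-1}(k) \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-\sqrt{n}-1} \mathbf{X}_j(k)\varepsilon_{j+1} \right)^2 \right] \\
&= \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, \mathbb{E}\left[ \operatorname{trace}\left( \mathbf{X}_n^{*\prime}(k)\Sigma^{-1}(k) \frac{1}{(n-K_n+1)^2} \sum_{j=K_n}^{n-\sqrt{n}-1} \mathbf{X}_j(k)\varepsilon_{j+1} \sum_{l=K_n}^{n-\sqrt{n}-1} \mathbf{X}_l'(k)\varepsilon_{l+1}\, \Sigma^{-1}(k)\mathbf{X}_n^{*}(k) \right) \right] \\
&= \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, \operatorname{trace}\left( \mathbb{E}\left[ \Sigma^{-1}(k) \frac{1}{(n-K_n+1)^2} \sum_{j=K_n}^{n-\sqrt{n}-1} \mathbf{X}_j(k)\varepsilon_{j+1} \sum_{l=K_n}^{n-\sqrt{n}-1} \mathbf{X}_l'(k)\varepsilon_{l+1}\, \Sigma^{-1}(k)\mathbf{X}_n^{*}(k)\mathbf{X}_n^{*\prime}(k) \right] \right).
\end{aligned}$$

Since the vectors $\mathbf{X}_n^{*}(k)$ and $\sum_{j=K_n}^{n-\sqrt{n}-1} \mathbf{X}_j(k)\varepsilon_{j+1}$ are uncorrelated because $k \le K_n$:

$$\begin{aligned}
\mathbb{E}\left[ \frac{n-K_n+1}{k\sigma_\varepsilon^2}\, f_1^2(k) \right]
&= \frac{n-K_n+1}{k\sigma_\varepsilon^2 (n-K_n+1)^2}\, \operatorname{trace}\left( \Sigma^{-1}(k)\,(n-K_n+1-\sqrt{n})\,\sigma_\varepsilon^2\, \Sigma(k)\Sigma^{-1}(k)\Sigma^{*}(k) \right) \\
&= \operatorname{trace}\left( \Sigma^{-1}(k)\Sigma^{*}(k)k^{-1} \right) (n-K_n+1-\sqrt{n})(n-K_n+1)^{-1}
\end{aligned}$$

where $\Sigma^{*}(k)$ is the covariance matrix of the vector $\mathbf{X}_n^{*}(k)$. We note that:

$$(n-K_n+1-\sqrt{n})(n-K_n+1)^{-1} \to 1 \quad \text{as } n \to +\infty.$$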

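So we only have to study the trace of $\Sigma^{-1}(k)\Sigma^{*}(k)k^{-1}$. We will use the following inequality: for all
$k \times k$ matrices $A$ and $B$

$$|\operatorname{trace}(AB)| \le \sqrt{\operatorname{trace}(AA')}\, \sqrt{\operatorname{trace}(BB')}.$$

We obtain:

$$\begin{aligned}
\max_{1\le k\le K_n} \left| \operatorname{trace}\left(\Sigma^{-1}(k)\Sigma^{*}(k)k^{-1}\right) - 1 \right|
&= \max_{1\le k\le K_n} \left| \operatorname{trace}\left(\Sigma^{-1}(k)(\Sigma^{*}(k) - \Sigma(k))k^{-1}\right) \right| \\
&\le \max_{1\le k\le K_n} \|\Sigma^{-1}(k)\|\, \|\Sigma^{*}(k) - \Sigma(k)\| \\
&\le \max_{1\le k\le K_n} \|\Sigma^{-1}(k)\| \max_{1\le k\le K_n} \|\Sigma^{*}(k) - \Sigma(k)\|.
\end{aligned}$$

$\Sigma(k) - \Sigma^{*}(k)$ is symmetric because $\Sigma(k)$ and $\Sigma^{*}(k)$ are two symmetric matrices, and its spectral norm is
lower than every other matrix norm. We use the subordinate norm defined for all matrix $Y = (y_{i,j})_{1\le i,j\le k}$
by:

$$\|Y\|_1 = \max_{j} \sum_{i=1}^{k} |y_{i,j}|.$$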
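The comparison used here, namely that the spectral norm of a symmetric matrix is bounded by the subordinate norm $\|\cdot\|_1$, can be checked numerically on random symmetric matrices. The helper names in this sketch are hypothetical and the random test matrices are an assumption of the example.

```python
import numpy as np


def max_column_sum_norm(Y):
    """Subordinate norm ||Y||_1: largest absolute column sum."""
    return float(np.abs(np.asarray(Y, dtype=float)).sum(axis=0).max())


def spectral_norm_sym(S):
    """Spectral norm of a symmetric matrix, i.e. its spectral radius."""
    return float(np.abs(np.linalg.eigvalsh(S)).max())


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    A = rng.standard_normal((6, 6))
    S = A + A.T                      # symmetric test matrix
    print(spectral_norm_sym(S), "<=", max_column_sum_norm(S))
```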
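For large $n$, we obtain

$$\begin{aligned}
\max_{1\le k\le K_n} \|\Sigma^{-1}(k)\| \max_{1\le k\le K_n} \|\Sigma^{*}(k) - \Sigma(k)\|
&\le \max_{1\le k\le K_n} \|\Sigma^{-1}(k)\| \max_{1\le k\le K_n} \|\Sigma(k) - \Sigma^{*}(k)\|_1 \\
&\le C \max_{1\le k\le K_n} \|\Sigma^{-1}(k)\| \max_{1\le k\le K_n} k \max_{0\le j\le k-1} \sum_{l=\sqrt{n}/2-K_n+1}^{+\infty} |b_l b_{l+j}| \\
&= O\left( \frac{K_n}{(\sqrt{n})^{1-2d-\delta}} \right)
\end{aligned}$$

for all $\delta > 0$. Then

$$\max_{1\le k\le K_n} \|\Sigma^{*}(k) - \Sigma(k)\| = o(1)$$

follows from condition (39).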

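Third step We consider the cross-product term $\mathbb{E}\left(f(k)S_n(k)L_n^{-1}(k)\right)$ and show that it is negligible.
Ing and Wei (2003) proved that

$$\left|\mathbb{E}\left(f(k)S_n(k)L_n^{-1}(k)\right)\right| = \left|\mathbb{E}\left((f(k) - f_1(k))S_n(k)L_n^{-1}(k)\right)\right|.$$

By the Cauchy-Schwarz inequality:

$$\max_{1\le k\le K_n} \left|\mathbb{E}\left((f(k) - f_1(k))S_n(k)L_n^{-1}(k)\right)\right| \le \left( \max_{1\le k\le K_n} \mathbb{E}\left((f(k) - f_1(k))^2 L_n^{-1}(k)\right) \right)^{1/2} \left( \max_{1\le k\le K_n} \mathbb{E}\left(S_n^2(k)L_n^{-1}(k)\right) \right)^{1/2}.$$

By (42), we obtain:

$$\max_{1\le k\le K_n} \mathbb{E}\left((f(k) - f_1(k))^2 L_n^{-1}(k)\right) = o(1)$$

and by using the definition (40) of $L_n(k)$, we have

$$\max_{1\le k\le K_n} \mathbb{E}\left(S_n^2(k)L_n^{-1}(k)\right) = O(1).$$

Finally we have

$$\max_{1\le k\le K_n} \left|\mathbb{E}\left((f(k) - f_1(k))S_n(k)L_n^{-1}(k)\right)\right| = o(1).$$

□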



In this theorem, we have obtained an asymptotic expression of the mean-squared prediction error of
$\hat{X}_{n+1}(k)$, which holds uniformly for all $1 \le k \le K_n$. In the short memory case i.e. assuming that the
process $(X_t)_{t\in\mathbb{Z}}$ is Gaussian, admits infinite moving average and autoregressive representations defined
in (6), that the coefficients $(a_j)_{j\in\mathbb{N}}$ verify $\sum_{j=1}^{+\infty} \sqrt{j}\,|a_j| < \infty$ and that the coefficients $b_j$ are absolutely
summable, Ing and Wei (2003) proved that if $K_n^{2+\delta} = O(n)$ for some $\delta > 0$:

$$\lim_{n\to+\infty} \max_{1\le k\le K_n} \left| \frac{\mathbb{E}\left(X_{n+1} - \hat{X}_{n+1}(k)\right)^2 - \sigma_\varepsilon^2}{L_n(k)} - 1 \right| = 0$$

with $L_n(k)$ defined as in (40).

The term $L_n(k)$ has the same expression in the short memory case as in the long memory case. It is the
sum of two terms: the first term $(k/n)\sigma_\varepsilon^2$ is proportional to the order of the model and is a measure of
the complexity of the predictor, the second term $\mathbb{E}(S_n(k))^2$ corresponds to the goodness of fit of the model.
This second term does not have the same asymptotic behaviour in the short and long memory cases: for short
memory time series it decays exponentially fast as a function of $k$, whereas for long memory time series
it has a Riemannian decay.
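The trade-off between the two terms of $L_n(k)$ can be illustrated on a fractionally integrated noise $(1-B)^d X_t = \varepsilon_t$. The sketch below is an illustration under stated assumptions, not a computation from the paper: it uses the standard closed forms for this process (variance $\sigma_\varepsilon^2\,\Gamma(1-2d)/\Gamma(1-d)^2$ and partial autocorrelation $d/(k-d)$ at lag $k$) and the Durbin-Levinson recursion to obtain $\mathbb{E}(S_n(k))^2$ as the $k$-lag prediction error variance minus $\sigma_\varepsilon^2$. The function names `projection_error` and `L_n` are hypothetical.

```python
import numpy as np
from math import gamma


def projection_error(d, k_max, sigma2_eps=1.0):
    """E(S_n(k))^2 for k = 1..k_max, fractionally integrated noise (assumed model)."""
    v = sigma2_eps * gamma(1 - 2 * d) / gamma(1 - d) ** 2   # variance sigma(0)
    out = np.empty(k_max)
    for k in range(1, k_max + 1):
        phi_kk = d / (k - d)        # partial autocorrelation at lag k
        v *= 1.0 - phi_kk ** 2      # Durbin-Levinson update of the error variance
        out[k - 1] = v - sigma2_eps
    return out


def L_n(d, n, K_n, sigma2_eps=1.0):
    """Approximate prediction error (40) for k = 1..K_n."""
    k = np.arange(1, K_n + 1)
    return projection_error(d, K_n, sigma2_eps) + k * sigma2_eps / (n - K_n + 1)


if __name__ == "__main__":
    values = L_n(0.3, 10000, 100)
    print("order minimising L_n(k):", int(np.argmin(values)) + 1)
    s2 = projection_error(0.3, 2000)
    print("k * E(S_n(k))^2 at k = 2000:", 2000 * s2[-1])
```

In this example the goodness-of-fit term decays like a constant times $k^{-1}$, so the minimising order grows roughly like the square root of the number of observations.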

In the following section, we will use the proof of Theorem 2 to obtain a central limit theorem for our
predictor.



4 Central limit theorem 



Like Bhansali (1978) and Lewis and Reinsel (1985) for short memory processes, we search for a normalisation
factor to obtain a convergence in distribution of the difference between our predictor $\hat{X}_{n+1}(K_n)$ and the
Wiener-Kolmogorov predictor $\widetilde{X}_{n+1} = -\sum_{i=1}^{+\infty} a_i X_{n+1-i}$, which is the linear least-squares predictor based
on all the past.

Theorem 3. Under assumptions H.1-H.4, if we choose the sequence $(K_n)_{n\in\mathbb{N}}$ such that:

K* = O(n) and K^ +2d = o (n 1-M ) , (44) 

then

$$\frac{1}{\sqrt{\mathbb{E}[S_n^2(K_n)]}} \left( \widetilde{X}_{n+1} - \hat{X}_{n+1}(K_n) \right) \underset{n\to+\infty}{\Longrightarrow} \mathcal{N}(0, 1).$$

Proof. The difference between our predictor $\hat{X}_{n+1}(K_n)$ and the Wiener-Kolmogorov predictor $\widetilde{X}_{n+1}$ is
equal to:

$$\widetilde{X}_{n+1} - \hat{X}_{n+1}(K_n) = f(K_n) + S_n(K_n). \qquad (45)$$

Since $(X_t)_{t\in\mathbb{Z}}$ is a Gaussian process with mean 0, $-\sum_{i=1}^{l} (a_i - a_{i,K_n}) X_{n+1-i}$ is a Gaussian random
variable with mean 0 for any integer $l$. But $-\sum_{i=1}^{l} (a_i - a_{i,K_n}) X_{n+1-i}$ converges in mean-squared
sense and thus in distribution to $S_n(K_n)$ as $l$ tends to infinity. Then $S_n(K_n)$ is a Gaussian random variable
with mean 0.

Consequently it is enough to prove that

$$\frac{1}{\sqrt{\mathbb{E}[S_n^2(K_n)]}}\, f(K_n) \overset{\mathbb{P}}{\longrightarrow} 0.$$

First we search for a bound for $1/\mathbb{E}[S_n^2(K_n)]$. For all integer $l$,

$$\mathbb{E}\left( \sum_{i=1}^{l} (a_i - a_{i,K_n}) X_{n+1-i} \right)^2 \ge 2\pi \underline{f} \sum_{i=1}^{l} (a_i - a_{i,K_n})^2$$

because the spectral density $f$, which is the Toeplitz symbol of the covariance matrix, is bounded below
by a positive constant $\underline{f}$ (see Grenander and Szegő (1958)). By taking the limit as $l \to +\infty$, we obtain:

$$\mathbb{E}\left( \sum_{i=1}^{+\infty} (a_i - a_{i,K_n}) X_{n+1-i} \right)^2 \ge 2\pi \underline{f} \sum_{i=1}^{+\infty} (a_i - a_{i,K_n})^2 \ge 2\pi \underline{f} \sum_{i=K_n+1}^{+\infty} a_i^2$$

since $a_{i,K_n} = 0$ when $i > K_n$. Under assumption H.4,

$$\sum_{i=K_n+1}^{+\infty} a_i^2 \underset{n\to+\infty}{\sim} \frac{K_n^{-2d-1} L^2(K_n)}{1+2d}$$

(see Proposition 1.5.10 of Bingham et al. (1987)). Then for any $\delta > 0$, there exists $C > 0$ such that:

$$\frac{1}{\mathbb{E}[S_n^2(K_n)]} \le C K_n^{1+2d+\delta}. \qquad (46)$$

By introducing $f_1$ defined in the proof of Theorem 2, we decompose the proof of the mean-squared
convergence in two parts. We will first show that:

$$\frac{1}{\sqrt{\mathbb{E}[S_n^2(K_n)]}} \left( f(K_n) - f_1(K_n) \right) \underset{n\to+\infty}{\overset{L^2}{\longrightarrow}} 0 \qquad (47)$$

then

$$\frac{1}{\sqrt{\mathbb{E}[S_n^2(K_n)]}}\, f_1(K_n) \underset{n\to+\infty}{\overset{L^2}{\longrightarrow}} 0. \qquad (48)$$

More precisely we will prove the mean-squared convergence (47), using the decomposition in four
terms (53), (54), (55) and (56) of the proof of Lemma 3.1 (see appendix). Using (57) and (46), the term
(53) verifies for any $\delta > 0$:

r 1 n \ 2 

I . n-vn-l n— 1 x 

nK H) V n ^-i^„ + l 



E 



n—^/n—l 

^ Xj(if n )e-j + i - Xj( J K n )e ;? - +1 



V » 5/4 J" 



(49) 



Under assumption (44), the mean (49) converges to 0.

Similarly for the term (54), using (61) and (46), we obtain for any $\delta > 0$:



E 



1 TmAt^ ^ n(K n )\^\K n )-t-\K n )\ J2 X.j{K n )e j+1 ) 

\v E K(^J] l j j=Kn J 



o 



n 

[n-K n + I) 



which converges to 0 under assumption (44) for sufficiently small $\delta$.
For the third term (55), by (66) and (46) we obtain:



E 



1 



n-l 



x£(tf n ) - X' n (K n )\ t- l (K n ) Y ^j(K n )e 3+1 ] = O 

j = Kn 



Kl +2d 



. 3-2d 

{n-K n + l) — 
which converges to 0 under assumption (44).

Finally the estimation of the fourth term (56) is directly given in (69):



E 



O 



K l+2d+5 

n 



(n-K n + l)i-2d-« 



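which converges to 0 under condition (44).
Now we will prove the mean-squared convergence (48).
By (43):

$$\mathbb{E}\left(f_1^2(K_n)\right) \underset{n\to+\infty}{\sim} \frac{K_n \sigma_\varepsilon^2}{n-K_n+1}.$$

Under condition (44), bound (46) implies:

$$\frac{1}{\mathbb{E}[S_n^2(K_n)]}\, \frac{K_n \sigma_\varepsilon^2}{n-K_n+1} \underset{n\to+\infty}{\longrightarrow} 0.$$

Then we have:

$$\lim_{n\to+\infty} \frac{1}{\mathbb{E}[S_n^2(K_n)]}\, \mathbb{E}\left(f_1^2(K_n)\right) = 0.$$

□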



Remark 1 The normalisation in Theorem 3 is not an explicit function of $K_n$. Nevertheless we have a
good idea of the rate of decay of $\mathbb{E}[S_n^2(K_n)]$ to 0. We have shown in (46) that under assumption H.4 for all
$\delta > 0$:

$$\exists C, \quad C K_n^{-2d-1-\delta} \le \mathbb{E}[S_n^2(K_n)].$$

In Godet (2007a) [Theorem 3.3.1], an upper bound for the rate of convergence is proved assuming H.1-H.2:

$$\exists C, \quad \mathbb{E}[S_n^2(K_n)] \le C K_n^{-1}.$$

For some processes, we even have an equivalent of $\mathbb{E}[S_n^2(K_n)]$. Consider a fractionally integrated noise
$(X_t)_{t\in\mathbb{Z}}$, which is the stationary solution of the difference equation:

$$(I - B)^d X_t = \varepsilon_t$$

where $(\varepsilon_t)_{t\in\mathbb{Z}}$ is a white noise with mean 0 and constant finite variance $\sigma_\varepsilon^2$ and $B$ is the backward-shift
operator. In this case, the rate of convergence is given by:

$$\exists C, \quad \mathbb{E}[S_n^2(K_n)] \underset{n\to+\infty}{\sim} C K_n^{-1}.$$
Remark 2 In both the short and the long memory case, the prediction error between our predictor
and the Wiener-Kolmogorov predictor has the same expression $L_n(K_n) = \frac{K_n}{n}\sigma_\varepsilon^2 + \mathbb{E}[S_n^2(K_n)]$. But we
do not use the same normalisation for central limit theorems.

In the central limit theorem for short memory processes, we only know results for independent realisation
prediction i.e. when the aim is to predict an independent series which has exactly the same probabilistic
structure as the observed one. Bhansali (1978) and Lewis and Reinsel (1985) proved a convergence in
distribution of $\left(\widetilde{X}_{n+1} - \hat{X}_{n+1}(K_n)\right)$ normalised by $\sqrt{K_n\sigma_\varepsilon^2/n}$ respectively in the univariate case and in the
multivariate case. This normalisation corresponds to the complexity of the estimation of the projection
coefficients.

In the long memory case the normalisation $\sqrt{\mathbb{E}[S_n^2(K_n)]}$ is given by the rate of convergence of the
predictor knowing a finite past to the linear least-squares predictor knowing the infinite past. In the
long memory case the rate of convergence due to the projection decays hyperbolically and is the main
term of the global error of prediction $L_n(K_n)$. On the contrary in the short memory case, the rate of
convergence due to the projection decays exponentially fast and is negligible with respect to the rate of
convergence due to the estimation of the projection coefficients.



5 Appendix 

5.1 Preliminary lemmas 

In the following lemmas we prove subsidiary asymptotic results, which we need in the proof of Theorem 

El 

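Lemma 5.1. Assume H.2. If $q \ge 1$, then for all $\delta > 0$, there exists a constant $C$ such that for all
$1 \le k \le K_n$:

$$E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\left(\varepsilon_{j+1,k} - \varepsilon_{j+1}\right) \right\|^q \le C\left(k\,(n-K_n+1)^{2d+\delta}\, E(S_n(k))^2\right)^{q/2} \qquad (50)$$

where the norm $\|.\|$ is defined in (9).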
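Proof. We have

$$\frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\left(\varepsilon_{j+1,k} - \varepsilon_{j+1}\right) = \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k) \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i}.$$

Without loss of generality, we assume that $q \ge 2$ since the result for $q \ge 1$ can be obtained from the
result for $q \ge 2$ and Jensen's inequality. Observe that:

$$\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k) \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right\|^q = (n-K_n+1)^{-q/2} \left( \sum_{l=0}^{k-1} \left( \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right)^2 \right)^{q/2}.$$

Since the function $x \mapsto x^{q/2}$ is convex on $\mathbb{R}^+$ if $q \ge 2$, we obtain by Jensen's inequality:

$$\left( \sum_{l=0}^{k-1} \left( \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right)^2 \right)^{q/2} \le k^{q/2-1} \sum_{l=0}^{k-1} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q.$$

Consequently

$$E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\left(\varepsilon_{j+1,k} - \varepsilon_{j+1}\right) \right\|^q \le k^{q/2-1} \sum_{l=0}^{k-1} E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q \right]. \qquad (51)$$

Furthermore

$$E\left( X_{j+1-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right) = \sum_{i=1}^{+\infty} (a_i - a_{i,k})\, \sigma(l-i) = 0 \qquad (52)$$

for any integer $l \in [1, k]$ by definition of $(a_{i,k})_{1 \le i \le k}$. Since the mean defined in (52) is equal to 0,

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q \right] = E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} \left( X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} - E\left( X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right) \right) \right|^q \right].$$

And then by applying Theorem 1 of Ing and Wei (2003) to the random variable
$Q = \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i}$, we obtain:

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q \right] \le C \left( \frac{1}{n-K_n+1} \sum_{s=K_n}^{n-1} \sum_{t=K_n}^{n-1} \sigma(s-t)\, \sigma^*(s-t) \right)^{q/2}$$

where $\sigma^*(.)$ is the autocovariance function of the process $\left( \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{t+1-i} \right)_{t \in \mathbb{Z}}$, i.e.

$$\sigma^*(s-t) = E\left[ \left( \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{s+1-i} \right) \left( \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{t+1-i} \right) \right].$$

As $|\sigma^*(s-t)| \le \sigma^*(0)$,

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q \right] \le C \left( \frac{\sigma^*(0)}{n-K_n+1} \sum_{s=K_n}^{n-1} \sum_{t=K_n}^{n-1} |\sigma(s-t)| \right)^{q/2} \le C \left( \frac{\sigma^*(0)}{n-K_n+1} \sum_{s=1}^{n-K_n+1} \sum_{t=1}^{n-K_n+1} |\sigma(s-t)| \right)^{q/2}.$$

Under assumption H.2, we have for any $\delta > 0$:

$$\sum_{s=1}^{n-K_n+1} \sum_{t=1}^{n-K_n+1} |\sigma(s-t)| \le C (n-K_n+1)^{2d+\delta+1}.$$

Then we have shown that for all $\delta > 0$,

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l} \sum_{i=1}^{+\infty} (a_i - a_{i,k}) X_{j+1-i} \right|^q \right] \le C \left( (n-K_n+1)^{2d+\delta}\, \sigma^*(0) \right)^{q/2}.$$

Notice that:

$$\sigma^*(0) = E(S_n(k))^2.$$

And this remark allows us to conclude. □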
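The bound on the double sum of covariances used in this proof can be checked numerically on an example. The sketch below assumes a fractionally integrated noise with $d = 0.3$ and unit innovation variance (an illustrative choice, with its closed-form autocovariance taken as given) and compares $\sum_{s=1}^{N}\sum_{t=1}^{N}|\sigma(s-t)|$ with $N^{2d+1}$. The function names are hypothetical.

```python
import math


def fi_autocovariance(d, max_lag, sigma2=1.0):
    """Autocovariances gamma(0..max_lag) of a fractionally integrated noise."""
    gamma = [sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for h in range(1, max_lag + 1):
        gamma.append(gamma[-1] * (h - 1 + d) / (h - d))
    return gamma


def abs_autocov_double_sum(gamma, size):
    """sum_{s=1}^{size} sum_{t=1}^{size} |gamma(s-t)|, grouped by lag h = |s-t|."""
    return size * abs(gamma[0]) + 2 * sum(
        (size - h) * abs(gamma[h]) for h in range(1, size)
    )


d = 0.3
gamma = fi_autocovariance(d, 4000)
for size in (500, 1000, 2000, 4000):
    # the ratio to size^(2d+1) should stay bounded as size grows
    print(size, abs_autocov_double_sum(gamma, size) / size ** (2 * d + 1))
```

For this particular process the ratio appears to stabilise, so the extra exponent $\delta$ is not visible in the illustration; it is needed in general to absorb a slowly varying factor.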

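Lemma 5.2. Assume that the assumptions of Theorem^ hold. If $q \ge 1$, then for any $1 \le k \le K_n$:

$$E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^q \le C k^{q/2}$$

with $C$ independent of $n$ and then of $k$.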



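Proof. The arguments are similar to those used for verifying Lemma 5.1. Without loss of generality we
assume that $q \ge 2$, since this result and Jensen's inequality allow us to conclude for $q \ge 1$. Reasoning as
for (51), we have by convexity:

$$E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^q \le k^{q/2-1} \sum_{l=0}^{k-1} E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l}\, \varepsilon_{j+1} \right|^q \right].$$

Applying again Theorem 1 of Ing and Wei (2003):

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l}\, \varepsilon_{j+1} \right|^q \right] \le C \left( \frac{1}{n-K_n+1} \sum_{s=K_n}^{n-1} \sum_{t=K_n}^{n-1} \sigma(s-t)\, \sigma_\varepsilon(s-t) \right)^{q/2}$$

where $\sigma_\varepsilon(.)$ is the autocovariance function of the process $(\varepsilon_t)_{t \in \mathbb{Z}}$, i.e.

$$\sigma_\varepsilon(s-t) = E(\varepsilon_t \varepsilon_s) = \begin{cases} 0 & \text{if } s \ne t \\ 1 & \text{otherwise.} \end{cases}$$

We obtain:

$$E\left[ (n-K_n+1)^{-q/2} \left| \sum_{j=K_n}^{n-1} X_{j-l}\, \varepsilon_{j+1} \right|^q \right] \le C \left( \frac{1}{n-K_n+1} (n-K_n+1)\, \sigma(0) \right)^{q/2} = O(1).$$

That concludes the proof.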



□



5.2 Proof of Lemma 3.1



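We recall that the constant $C$ may have different values in the different equations but is always
independent of $n$ and then of $k$ since we want a convergence for all $1 \le k \le K_n$.
We decompose $f(k) - f_1(k)$ into 4 parts, which we estimate separately:

$$\begin{aligned}
f(k) - f_1(k) ={}& X^{*\prime}_n(k)\, \Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \left( \sum_{j=K_n}^{n-\sqrt{n}-1} X_j(k)\, \varepsilon_{j+1} - \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right) && (53) \\
&+ X^{*\prime}_n(k) \left( \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right) \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} && (54) \\
&+ \left( X^{*\prime}_n(k) - X'_n(k) \right) \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} && (55) \\
&+ X'_n(k)\, \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k) \left( \varepsilon_{j+1} - \varepsilon_{j+1,k} \right). && (56)
\end{aligned}$$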



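Study of the term given in (53). In this part we want to prove the mean-squared convergence
to 0 of:

$$\sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k)\, \Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \left( \sum_{j=K_n}^{n-\sqrt{n}-1} X_j(k)\, \varepsilon_{j+1} - \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right). \qquad (57)$$

The two sums differ only by the terms with indices $j = n-\sqrt{n}, \dots, n-1$. Hölder's inequality applied twice with $1/p + 1/q = 1$ and $1/p' + 1/q' = 1$ gives:

$$\begin{aligned}
E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k)\, \Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=n-\sqrt{n}}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2
&\le \left( E\left( \left\| \frac{1}{\sqrt{k}} X^*_n(k) \right\| \left\| \Sigma^{-1}(k) \right\| \right)^{2q} \right)^{1/q} \left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=n-\sqrt{n}}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p} \\
&\le \left( E\left\| \frac{1}{\sqrt{k}} X^*_n(k) \right\|^{2q'q} \right)^{1/(q'q)} \left( E\left\| \Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=n-\sqrt{n}}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p}.
\end{aligned}$$

Under assumption H.3 for all $p'$ and $q'$:

$$\left( E\left\| \Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} = \left\| \Sigma^{-1}(k) \right\|^2 = O(1) \qquad (58)$$

since the spectral density of the process $(X_t)_{t \in \mathbb{Z}}$ admits a positive lower bound and then the largest
eigenvalue of $\Sigma^{-1}(k)$ is bounded. Furthermore by the convexity of the function $x \mapsto x^{2q'q}$ and the
stationarity of the process $(\varepsilon_t)_{t \in \mathbb{Z}}$,

$$\left( E\left\| \frac{1}{\sqrt{k}} X^*_n(k) \right\|^{2q'q} \right)^{1/(q'q)} \le \left( E\left| \sum_{j=0}^{\sqrt{n}/2-K_n} b_j\, \varepsilon_{n-j} \right|^{2q'q} \right)^{1/(q'q)}$$

and by Lemma 2 of Wei (1987):

$$\left( E\left| \sum_{j=0}^{\sqrt{n}/2-K_n} b_j\, \varepsilon_{n-j} \right|^{2q'q} \right)^{1/(q'q)} \le C \sum_{j=0}^{\sqrt{n}/2-K_n} b_j^2 \le C \qquad (59)$$

because the sequence $(b_j^2)_{j \in \mathbb{N}}$ is summable.

Finally, Lemma 5.2 provides a bound (60) for the last factor,

$$\left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=n-\sqrt{n}}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p}, \qquad (60)$$

after renormalising the sum by the square root of its number of terms.

By inequalities (58), (59) and (60), the second moment of (57) is bounded by a quantity
which converges to 0 as $n$ tends to infinity under assumption (39).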
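Inequality (58) rests on the fact that a positive lower bound on the spectral density yields a lower bound, uniform in $k$, on the eigenvalues of $\Sigma(k)$. The sketch below illustrates this on an assumed example which is not taken from the text: a fractionally integrated noise with $d = 0.3$ and unit innovation variance, for which $2\pi \min_\lambda f(\lambda) = 4^{-d}$. It evaluates the Rayleigh quotient $x'\Sigma(k)x / \|x\|^2$ along the alternating vector $x_j = (-1)^j$, which probes the frequency $\pi$ where this spectral density is minimal. The function names are hypothetical.

```python
import math


def fi_autocovariance(d, max_lag, sigma2=1.0):
    """Autocovariances gamma(0..max_lag) of a fractionally integrated noise."""
    gamma = [sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for h in range(1, max_lag + 1):
        gamma.append(gamma[-1] * (h - 1 + d) / (h - d))
    return gamma


def rayleigh_alternating(gamma, k):
    """x' Sigma(k) x / ||x||^2 for x_j = (-1)^j and Sigma(k) = [gamma(|i-j|)]."""
    total = gamma[0]
    for h in range(1, k):
        total += 2 * (1 - h / k) * (-1) ** h * gamma[h]
    return total


d = 0.3
gamma = fi_autocovariance(d, 400)
lower_bound = 4 ** (-d)  # 2*pi*min f for unit innovation variance (assumed example)
for k in (10, 50, 200, 400):
    # every Rayleigh quotient dominates the smallest eigenvalue of Sigma(k)
    print(k, rayleigh_alternating(gamma, k), lower_bound)
```

Since any Rayleigh quotient is at least the smallest eigenvalue of $\Sigma(k)$, the printed values stay above the lower bound; in this example they also approach it as $k$ grows, which suggests that the bound is sharp for this process.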
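Study of the term given in (54). Prove that:

$$\lim_{n \to +\infty} E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k) \left[ \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right] \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 = 0. \qquad (61)$$

Applying twice Hölder's inequality, we have:

$$E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k) \left[ \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right] \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le \left( E\left\| \frac{1}{\sqrt{k}} X^*_n(k) \right\|^{2q'q} \right)^{1/(q'q)} \left( E\left\| \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p}.$$

Applying Lemma 5.2, we obtain that:

$$\left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p} \le C k \le C K_n, \qquad (62)$$

since $k \le K_n$. Now we derive the mean-squared convergence to 0 when $d \in ]0, 1/2[$.

For $d \in ]0, 1/4[$, we apply Theorem [T] and we get:

$$\left( E\left\| \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \le C\, \frac{K_n^2}{n-K_n+1}. \qquad (63)$$

Then it follows from (59), (62) and (63) that:

$$E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k) \left[ \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right] \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le C\, \frac{K_n^3}{n-K_n+1}$$

which converges to 0 if condition (39) holds.

If $d \in ]1/4, 1/2[$, we obtain by Theorem [T] that

$$\left( E\left\| \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \le C\, \frac{K_n^2\, L^2(n-K_n+1)}{(n-K_n+1)^{2-4d}}. \qquad (64)$$

The inequalities (59), (62) and (64) allow us to conclude that:

$$E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k) \left[ \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right] \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le C\, \frac{K_n^3\, L^2(n-K_n+1)}{(n-K_n+1)^{2-4d}}$$

which converges to 0 under assumption (39).

For $d = 1/4$, applying Theorem (pQ) we obtain that

$$\left( E\left\| \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \le C\, \frac{K_n^2\, L^2(n-K_n+1) \log(n-K_n+1)}{n-K_n+1}. \qquad (65)$$

The inequalities (59), (62) and (65) allow us to conclude that:

$$E\left( \sqrt{\frac{n-K_n+1}{k}}\; X^{*\prime}_n(k) \left[ \Sigma^{-1}(k) - \hat\Sigma^{-1}(k) \right] \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le C\, \frac{K_n^3\, L^2(n-K_n+1) \log(n-K_n+1)}{n-K_n+1}.$$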
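The double application of Hölder's inequality used in these studies can be checked numerically, since the inequality holds for any probability measure, in particular for the empirical measure of a sample. The sketch below is only an illustration: the three scalar samples `a`, `b`, `c` stand in for the three norms of the product, and their distributions, the seed and the function names are arbitrary assumptions.

```python
import random


def moment(values, power):
    """Empirical moment E|V|^power under the uniform measure on the sample."""
    return sum(abs(v) ** power for v in values) / len(values)


def holder_twice_bound(a, b, c, p, p_prime):
    """(E|A|^{2q'q})^{1/(q'q)} (E|B|^{2p'q})^{1/(p'q)} (E|C|^{2p})^{1/p}
    with 1/p + 1/q = 1 and 1/p' + 1/q' = 1."""
    q = p / (p - 1)
    q_prime = p_prime / (p_prime - 1)
    return (
        moment(a, 2 * q_prime * q) ** (1 / (q_prime * q))
        * moment(b, 2 * p_prime * q) ** (1 / (p_prime * q))
        * moment(c, 2 * p) ** (1 / p)
    )


rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(1000)]
b = [rng.gauss(0, 1) for _ in range(1000)]
c = [ai + rng.gauss(0, 1) for ai in a]  # deliberately dependent on a

# left-hand side: empirical second moment of the product A * B * C
lhs = sum((x * y * z) ** 2 for x, y, z in zip(a, b, c)) / len(a)
for p, p_prime in ((2, 2), (3, 1.5), (1.5, 4)):
    print(p, p_prime, lhs, holder_twice_bound(a, b, c, p, p_prime))
```

The bound holds whatever the dependence between the three factors, which is why it can be applied to $X^*_n(k)$, the matrix difference and the normalised sum without any independence assumption.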



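Study of the term given in (55). Prove that:

$$\lim_{n \to +\infty} E\left( \sqrt{\frac{n-K_n+1}{k}} \left[ X^{*\prime}_n(k) - X'_n(k) \right] \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 = 0. \qquad (66)$$

Using Hölder's inequality twice, we have:

$$E\left( \sqrt{\frac{n-K_n+1}{k}} \left[ X^{*\prime}_n(k) - X'_n(k) \right] \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le \left( E\left\| \frac{1}{\sqrt{k}} \left( X^*_n(k) - X_n(k) \right) \right\|^{2q'q} \right)^{1/(q'q)} \left( E\left\| \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right\|^{2p} \right)^{1/p}.$$

In view of the convexity of the function $x \mapsto x^{2q'q}$ and of the stationarity of the process $(\varepsilon_t)_{t \in \mathbb{Z}}$, we have:

$$\max_{1 \le k \le K_n} E\left( \frac{1}{\sqrt{k}} \left\| X^*_n(k) - X_n(k) \right\| \right)^{2q'q} \le E\left| \sum_{j=\sqrt{n}/2-K_n+1}^{+\infty} b_j\, \varepsilon_{n-j} \right|^{2q'q}.$$

And by Lemma 2 of Wei (1987), we obtain

$$\max_{1 \le k \le K_n} \left( E\left( \frac{1}{\sqrt{k}} \left\| X^*_n(k) - X_n(k) \right\| \right)^{2q'q} \right)^{1/(q'q)} \le C \sum_{j=\sqrt{n}/2-K_n+1}^{+\infty} b_j^2 \le C\, n^{\frac{2d-1}{2}}. \qquad (67)$$

Then by Theorem [T]

$$\left( E\left\| \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \le C. \qquad (68)$$

By inequalities (62), (67) and (68), we then obtain:

$$E\left( \sqrt{\frac{n-K_n+1}{k}} \left[ X^{*\prime}_n(k) - X'_n(k) \right] \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k)\, \varepsilon_{j+1} \right)^2 \le C\, K_n\, n^{\frac{2d-1}{2}}$$

which converges to 0 as $n$ tends to infinity if condition (39) holds.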
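Study of the term given in (56). We want to prove that:

$$\lim_{n \to +\infty} E\left( \frac{1}{\sqrt{E(S_n(k))^2}}\; X'_n(k)\, \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k) \left[ \varepsilon_{j+1,k} - \varepsilon_{j+1} \right] \right)^2 = 0.$$

Using Hölder's inequality twice, we have:

$$E\left( \frac{1}{\sqrt{E(S_n(k))^2}}\; X'_n(k)\, \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k) \left[ \varepsilon_{j+1,k} - \varepsilon_{j+1} \right] \right)^2 \le \frac{1}{(n-K_n+1)\, E(S_n(k))^2} \left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k) \left[ \varepsilon_{j+1,k} - \varepsilon_{j+1} \right] \right\|^{2q'q} \right)^{1/(q'q)} \left( E\left\| \hat\Sigma^{-1}(k) \right\|^{2p'q} \right)^{1/(p'q)} \left( E\left\| X_n(k) \right\|^{2p} \right)^{1/p}.$$

Applying Lemma 5.1, we obtain for every $\delta > 0$:

$$\left( E\left\| \frac{1}{\sqrt{n-K_n+1}} \sum_{j=K_n}^{n-1} X_j(k) \left[ \varepsilon_{j+1,k} - \varepsilon_{j+1} \right] \right\|^{2q'q} \right)^{1/(q'q)} \le C \left( k\, (n-K_n+1)^{2d+\delta}\, E(S_n(k))^2 \right). \qquad (70)$$

Finally we choose $p = 2$ and we express $\left( E\left\| X_n(k) \right\|^4 \right)^{1/2}$ in terms of the covariances (69),
since the process $(X_t)_{t \in \mathbb{Z}}$ is Gaussian. Using the assumption H.2 on the covariances, we verify that for
all $\delta > 0$:

$$\left( E\left\| X_n(k) \right\|^4 \right)^{1/2} \le C \sqrt{k^{4d+\delta}} \le C \sqrt{k} \qquad (71)$$

if $d \in ]0, 1/4[$. With these three inequalities (68), (70), (71) and $1 \le k \le K_n$, we conclude that:

$$\forall \delta > 0, \quad E\left( \frac{1}{\sqrt{E(S_n(k))^2}}\; X'_n(k)\, \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k) \left( \varepsilon_{j+1,k} - \varepsilon_{j+1} \right) \right)^2 \le C\, \frac{1}{(n-K_n+1)\, E(S_n(k))^2}\, \sqrt{k}\; k\, (n-K_n+1)^{2d+\delta}\, E(S_n(k))^2 \le C\, \frac{K_n^{3/2}}{(n-K_n+1)^{1-2d-\delta}},$$

which converges to 0 under condition (39).

On the other hand if $d \in [1/4, 1/2[$, inequality (71) becomes:

$$\forall \delta > 0, \quad \left( E\left\| X_n(k) \right\|^4 \right)^{1/2} \le C \sqrt{k^{4d+\delta}}. \qquad (72)$$

Using inequalities (68), (70) and (72), we have for all $\delta > 0$:

$$E\left( \frac{1}{\sqrt{E(S_n(k))^2}}\; X'_n(k)\, \hat\Sigma^{-1}(k)\, \frac{1}{n-K_n+1} \sum_{j=K_n}^{n-1} X_j(k) \left( \varepsilon_{j+1,k} - \varepsilon_{j+1} \right) \right)^2 \le C\, \frac{k \sqrt{k^{4d+\delta}}}{(n-K_n+1)^{1-2d-\delta}},$$

which converges to 0 under the same condition.

We have proved that for all $d \in ]0, 1/2[$:

$$\lim_{n \to +\infty} \max_{1 \le k \le K_n} E\left( \frac{1}{\sqrt{L_n(k)}} \left( f(k) - f_1(k) \right) \right)^2 = 0.$$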